Applying pitch-dependent difference detection and modification to emotional speaker recognition
Authors
Abstract
Emotion is an internal source of variability that can degrade speaker recognition performance by inducing extra intra-speaker vocal variability. Several enhancements have been applied to speaker recognition under emotional speech. However, these methods are limited in that they require either emotional speech in training or knowledge of the speaker's emotional state in testing. This paper presents a novel approach based on Pitch-dependent Difference Detection and Modification (PDDM) that overcomes this limitation. In this method, only neutral speech is used to train the speaker models, and no emotional state information is needed in testing. Experimental results on MASC show that this method improves the identification rate by 4.7% in the best case compared with traditional speaker recognition.
Similar resources
MFCC based Enlargement of the Training Set for Emotion Recognition in Speech
Emotional state recognition through speech is a very interesting research topic nowadays. Using subliminal information in speech, known as “prosody”, it is possible to recognize the emotional state of the person. One of the main problems in the design of automatic emotion recognition systems is the small number of available patterns. This fact makes the learning process more difficu...
Statistical Variation Analysis of Formant and Pitch Frequencies in Anger and Happiness Emotional Sentences in Farsi Language
The setup of an emotion recognition or emotional speech recognition system depends directly on how emotion changes speech features. In this research, the influence of emotion on speech in anger and happiness was evaluated and the results were compared with neutral speech. For this purpose, the pitch frequency and the first three formant frequencies were used. The experimental results showed that there are lo...
Synthetical Enlargement of Mfcc Based Training Sets for Emotion Recognition
Emotional state recognition through speech is a very interesting research topic nowadays. Using subliminal information in speech, it is possible to recognize the emotional state of the person. One of the main problems in the design of automatic emotion recognition systems is the small number of available patterns. This fact makes the learning process more difficult, due to the generalizat...
Text-independent Speaker Identification Based on MAP Channel Compensation and Pitch-dependent Features
One major source of performance decline in speaker recognition systems is channel mismatch between training and testing. This paper focuses on improving the channel robustness of speaker recognition systems in two respects: channel compensation techniques and channel-robust features. The system is a text-independent speaker identification system based on two-stage recognition. In the aspect of channel ...
Applying Score Reliability Fusion to Bi-Model Emotional Speaker Recognition
Emotion mismatch between training and testing is one of the important factors that degrade the performance of speaker recognition systems. In our previous work, a bi-model emotion speaker recognition (BESR) method based on virtual HD (High Different from neutral, with large pitch offset) speech synthesizing was proposed to deal with this problem. It enhanced the system performance under m...
Publication date: 2008